AITopics

Country:

North America > United States > Texas (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > United States > Virginia > Arlington County > Arlington (0.04)
(5 more...)

Genre: Research Report > New Finding (0.46)

Industry: Leisure & Entertainment > Games (1.00)

Technology:

Information Technology > Game Theory (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.95)

Neural Information Processing SystemsFeb-14-2026, 14:19:32 GMT

Breaking the Activation Function Bottleneck through Adaptive Parameterization

Sebastian Flennerhag, Hujun Yin, John Keane, Mark Elliot

Adaptive parameterization is a means of increasing this flexibility and thereby increasing the model's capacity to learn non-linear patterns. We focus on the feed-forward layer, f(x):= φ(W x+b),for some activation functionφ: R 7 R. Define the pre-activation layer as a = A(x):= Wx+band denote byg(a):= φ(a)/athe activation effect ofφgivena, where divisioniselement-wise.

artificial intelligence, arxivpreprint, machine learning, (17 more...)

Country: North America > Canada > Quebec > Montreal (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.62)

Neural Information Processing SystemsNov-20-2025, 20:02:51 GMT

Breaking the Activation Function Bottleneck through Adaptive Parameterization

Sebastian Flennerhag, Hujun Yin, John Keane, Mark Elliot

Adaptive parameterization is a means of increasing this flexibility and thereby increasing the model's capacity to learn non-linear patterns.

artificial intelligence, deep learning, machine learning, (15 more...)

Country:

North America > Canada > Quebec > Montreal (0.04)
Europe > Czechia > South Moravian Region > Brno (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Costa, Miguel, Vandervoort, Arthur, Drews, Martin, Morrissey, Karyn, Pereira, Francisco C.

Climate Adaptation with Reinforcement Learning: Economic vs. Quality of Life Adaptation Pathways

arXiv.org Artificial IntelligenceNov-6-2025

Climate change will cause an increase in the frequency and severity of flood events, prompting the need for cohesive adaptation policymaking. Designing effective adaptation policies, however, depends on managing the uncertainty of long-term climate impacts. Meanwhile, such policies can feature important normative choices that are not always made explicit. We propose that Reinforcement Learning (RL) can be a useful tool to both identify adaptation pathways under uncertain conditions while it also allows for the explicit modelling (and consequent comparison) of different adaptation priorities (e.g. economic vs. wellbeing). We use an Integrated Assessment Model (IAM) to link together a rainfall and flood model, and compute the impacts of flooding in terms of quality of life (QoL), transportation, and infrastructure damage. Our results show that models prioritising QoL over economic impacts results in more adaptation spending as well as a more even distribution of spending over the study area, highlighting the extent to which such normative assumptions can alter adaptation policy. Our framework is publicly available: https://github.com/MLSM-at-DTU/maat_qol_framework.

artificial intelligence, machine learning, reinforcement learning, (14 more...)

2511.03243

Country: Europe > Denmark > Capital Region (0.16)

Genre: Research Report > New Finding (1.00)

Industry:

Transportation > Infrastructure & Services (0.94)
Transportation > Ground > Road (0.69)
Health & Medicine (0.64)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.87)

Neural Information Processing SystemsOct-9-2025, 03:26:17 GMT

Conservative Offline Policy Adaptation in Multi-Agent Games

We prove that CSP learns a near-optimal risk-free offline adaptation policy upon convergence.

artificial intelligence, machine learning, reinforcement learning, (17 more...)

Country:

North America > United States > Texas (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > United States > Virginia > Arlington County > Arlington (0.04)
(5 more...)

Genre: Research Report > New Finding (0.46)

Industry: Leisure & Entertainment > Games (1.00)

Technology:

Information Technology > Game Theory (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

arXiv.org Artificial IntelligenceOct-2-2025

Navigating the Synchrony-Stability Frontier in Adaptive Chatbots

Brandt, T. James

Adaptive chatbots that mimic a user's linguistic style can build rapport and engagement, yet unconstrained mimicry risks an agent that feels unstable or sycophantic. We present a computational evaluation framework that makes the core design tension explicit: balancing moment-to-moment linguistic synchrony against long-term persona stability. Using an 8-dimensional style vector and a closed-loop "base+delta" prompting architecture, we simulate and compare explicit adaptation policies - Uncapped, Cap, Exponential Moving Average (EMA), Dead-Band, and Hybrids - on a human-log dataset. Our analysis maps a clear Pareto frontier: bounded policies achieve substantial gains in stability at a modest cost to synchrony. For example, a Hybrid (EMA+Cap) raises stability from 0.542 to 0.878 (+62%) while reducing synchrony by only 17%. We confirm this trade-off through large-scale replications on three public corpora (DailyDialog, Persona-Chat, EmpatheticDialogues) and LLM-in-the-loop validation across two model families. Furthermore, we quantify "prompt legibility," showing that frontier policies reduce instruction churn and cut jarring register flips (major tone changes) from 0.254 to 0.092, yielding systems that are easier to reason about and maintain. Taken together, our framework provides a general evaluation harness for style adaptation; a systematic ablation that identifies Pareto-efficient policies; robust validation across diverse datasets and models; and novel legibility metrics linking policy choices to system maintainability.

large language model, machine learning, natural language, (20 more...)

2510.00339

Country:

Europe (1.00)
North America > United States (0.93)
Asia (0.67)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry: Energy (0.34)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

Ren, Allen Z., Dai, Hongkai, Burchfiel, Benjamin, Majumdar, Anirudha

AdaptSim: Task-Driven Simulation Adaptation for Sim-to-Real Transfer

arXiv.org Artificial IntelligenceSep-30-2023

Simulation parameter settings such as contact models and object geometry approximations are critical to training robust robotic policies capable of transferring from simulation to real-world deployment. Previous approaches typically handcraft distributions over such parameters (domain randomization), or identify parameters that best match the dynamics of the real environment (system identification). However, there is often an irreducible gap between simulation and reality: attempting to match the dynamics between simulation and reality across all states and tasks may be infeasible and may not lead to policies that perform well in reality for a specific task. Addressing this issue, we propose AdaptSim, a new task-driven adaptation framework for sim-to-real transfer that aims to optimize task performance in target (real) environments -- instead of matching dynamics between simulation and reality. First, we meta-learn an adaptation policy in simulation using reinforcement learning for adjusting the simulation parameter distribution based on the current policy's performance in a target environment. We then perform iterative real-world adaptation by inferring new simulation parameter distributions for policy training, using a small amount of real data. We perform experiments in three robotic tasks: (1) swing-up of linearized double pendulum, (2) dynamic table-top pushing of a bottle, and (3) dynamic scooping of food pieces with a spatula. Our extensive simulation and hardware experiments demonstrate AdaptSim achieving 1-3x asymptotic performance and $\sim$2x real data efficiency when adapting to different environments, compared to methods based on Sys-ID and directly training the task policy in target environments. Website: https://irom-lab.github.io/AdaptSim/

target environment, task policy, trajectory, (15 more...)

2302.04903

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.67)

#artificialintelligenceJun-27-2022, 22:46:53 GMT

Reinforcement learning based adaptive metaheuristics

Parameter adaptation, that is the capability to automatically adjust an algorithm's hyperparameters depending on the problem being faced, is one of the main trends in evolutionary computation applied to numerical optimization. While several handcrafted adaptation policies have been proposed over the years to address this problem, only few attempts have been done so far at apply machine learning to learn such policies. Here, we introduce a general-purpose framework for performing parameter adaptation in continuous-domain metaheuristics based on state-of-the-art reinforcement learning algorithms. We demonstrate the applicability of this framework on two algorithms, namely Covariance Matrix Adaptation Evolution Strategies (CMA-ES) and Differential Evolution (DE), for which we learn, respectively, adaptation policies for the step-size (for CMA-ES), and the scale factor and crossover rate (for DE). We train these policies on a set of 46 benchmark functions at different dimensionalities, with various inputs to the policies, in two settings: one policy per function, and one global policy for all functions.

adaptation policy, algorithm, reinforcement, (1 more...)

#artificialintelligence

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.99)
Information Technology > Artificial Intelligence > Machine Learning > Evolutionary Systems (0.63)

arXiv.org Artificial IntelligenceJun-8-2021

A critical look at the current train/test split in machine learning

Tan, Jimin, Yang, Jianan, Wu, Sai, Chen, Gang, Zhao, Jake

The randomized or cross-validated split of training and testing sets has been adopted as the gold standard of machine learning for decades. The establishment of these split protocols are based on two assumptions: (i)-fixing the dataset to be eternally static so we could evaluate different machine learning algorithms or models; (ii)-there is a complete set of annotated data available to researchers or industrial practitioners. However, in this article, we intend to take a closer and critical look at the split protocol itself and point out its weakness and limitation, especially for industrial applications. In many real-world problems, we must acknowledge that there are numerous situations where assumption (ii) does not hold. For instance, for interdisciplinary applications like drug discovery, it often requires real lab experiments to annotate data which poses huge costs in both time and financial considerations. In other words, it can be very difficult or even impossible to satisfy assumption (ii). In this article, we intend to access this problem and reiterate the paradigm of active learning, and investigate its potential on solving problems under unconventional train/test split protocols. We further propose a new adaptive active learning architecture (AAL) which involves an adaptation policy, in comparison with the traditional active learning that only unidirectionally adds data points to the training pool. We primarily justify our points by extensively investigating an interdisciplinary drug-protein binding problem. We additionally evaluate AAL on more conventional machine learning benchmarking datasets like CIFAR-10 to demonstrate the generalizability and efficacy of the new framework.

active learning, iteration, learning, (13 more...)